{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "78e3457b",
   "metadata": {},
   "source": [
    "# Make Test Dataset\n",
    "\n",
    "When you encounter unexpected errors in the `power-grid-model`, you would like certainly to report the issue and debug (maybe by another developer) the calculation core with certain dataset. To make this possible, we have implemented a generic mechanism to export/import the dataset to/from JSON files, and to debug the calculation core in both Python and C++ with the test dataset. \n",
    "\n",
    "In this notebook we will learn how test datasets are made in this repository, including:\n",
    "\n",
    "* Structure of validation test datasets in this repository\n",
    "* Format of test datasets (JSON)\n",
    "* Use of helper functions to save and load the datasets"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "05a5bd6e",
   "metadata": {},
   "source": [
    "## Structure of Validation Datasets\n",
    "\n",
    "All validation test datasets are located in the [tests/data](https://github.com/PowerGridModel/power-grid-model/tree/main/tests/data) folder. The structure of the folder is as follows:\n",
    "\n",
    "```\n",
    "data\n",
    "   |\n",
    "   |\n",
    "   - power_flow\n",
    "             |\n",
    "             - power_flow_testset_1\n",
    "             - power_flow_testset_2\n",
    "             - ...\n",
    "   - state_estimation\n",
    "             |\n",
    "             - state_estimation_testset_1\n",
    "             - state_estimation_testset_2\n",
    "             - ...\n",
    "```\n",
    "\n",
    "The testsets are separated in two types of calculations: `power_flow` and `state_estimation`. In each folder there are subfolders for individual testset. The test datasets are used in both Python and C++ unit tests. Therefore, once you create extra test datasets in the folder, you can debug the program in both Python and C++."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4315ddc8",
   "metadata": {},
   "source": [
    "### Individual Dataset\n",
    "\n",
    "An individual dataset folder (in either `power_flow` or `state_estimation`) will consists of following files:\n",
    "\n",
    "* `params.json`: calculation parameters, mandatory\n",
    "* `input.json`: input data, mandatory\n",
    "* `sym_output.json`: reference symmetric output\n",
    "* `asym_output.json`: reference asymmetric output\n",
    "* `update_batch.json`: update batch data, mandatory if `sym_output_batch.json` or `asym_output_batch.json` exists.\n",
    "* `sym_output_batch.json`: reference symmetric batch output\n",
    "* `asym_output_batch.json`: reference asymmetric batch output\n",
    "\n",
    "The `params.json` and `input.json` are always needed. The test program (in Python and C++) will detect other files and instantiate relevant test calculations. For example, if `sym_output.json` exists, the test program will run a one-time symmetric calculation and compare the results to the reference results in `sym_output.json`."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "e5460d14",
   "metadata": {},
   "source": [
    "#### Test Parameters\n",
    "\n",
    "The `params.json` looks something like this:\n",
    "\n",
    "```json\n",
    "{\n",
    "  \"calculation_method\": \"iterative_linear\",\n",
    "  \"rtol\": 1e-8,\n",
    "  \"atol\": {\n",
    "    \"default\": 1e-8,\n",
    "    \".+_residual\": 1e-4\n",
    "  }\n",
    "}\n",
    "```\n",
    "\n",
    "You need to specify the method to use for the calculation, the relative and absolute tolerance to compare the calculation results with the reference results. For `rtol` you always give one number. For `atol` you can also give one number, or you can give a dictionary with regular expressions to match the attribute names. In this way you can have fine control of individual tolerance for each attribut (e.g. active/reactive power). In the example it has an absolute tolerance of `1e-4` for attributes which ends with `_residual` and `1e-8` for everything else.\n",
    "\n",
    "The `calculation_method` can be one string or list of strings. In the latter case, the test program will run the validation test mutilple times using all the specified methods.\n",
    "\n",
    "See [below](#detailed-configuration-with-the-paramsjson) for details."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "43d027cd",
   "metadata": {},
   "source": [
    "### JSON Data Format\n",
    "\n",
    "The data format is well explained in these resources\n",
    "[Serialization documentation](../user_manual/serialization.md) and some examples of Serialization are given in [Serialization notebook](./Serialization%20Example.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef0e663c",
   "metadata": {},
   "source": [
    "### Empty Result File\n",
    "\n",
    "If you encounter a crash for a certain dataset. You can also create the input data into JSON files. In this case you might not have any reference result to compare, because you just need to find where the crash happens. You still need an empty (dictionary) result file to trigger the calculation.\n",
    "\n",
    "For `sym_output.json`:\n",
    "\n",
    "```json\n",
    "{\n",
    "  \"attributes\": {},\n",
    "  \"data\": {},\n",
    "  \"is_batch\": false,\n",
    "  \"type\": \"sym_output\",\n",
    "  \"version\": \"1.0\"\n",
    "}\n",
    "```\n",
    "\n",
    "For `sym_output_batch.json`:\n",
    "\n",
    "```json\n",
    "{\n",
    "  \"attributes\": {},\n",
    "  \"data\": [{}, {}, {}],\n",
    "  \"is_batch\": true,\n",
    "  \"type\": \"sym_output\",\n",
    "  \"version\": \"1.0\"\n",
    "}\n",
    "```\n",
    "\n",
    "    \n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "99220dfe",
   "metadata": {},
   "source": [
    "## Helper Functions to Import and Export\n",
    "\n",
    "In the module `power_grid_model.utils` we have some helper functions to import a json file to a `power-grid-model` compatible dataset, or the other way around. \n",
    "\n",
    "Please refer to the [documentation](../api_reference/python-api-reference.md) for detailed function signature.\n",
    "\n",
    "In this notebook we export the example network from [Power Flow](./Power%20Flow%20Example.ipynb) to json. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "b158e92f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# first build the network\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "from power_grid_model import AttributeType, ComponentType, DatasetType, LoadGenType, PowerGridModel, initialize_array\n",
    "\n",
    "# network\n",
    "\n",
    "# node\n",
    "node = initialize_array(DatasetType.input, ComponentType.node, 3)\n",
    "node[AttributeType.id] = [1, 2, 6]\n",
    "node[AttributeType.u_rated] = [10.5e3, 10.5e3, 10.5e3]\n",
    "\n",
    "# line\n",
    "line = initialize_array(DatasetType.input, ComponentType.line, 3)\n",
    "line[AttributeType.id] = [3, 5, 8]\n",
    "line[AttributeType.from_node] = [1, 2, 1]\n",
    "line[AttributeType.to_node] = [2, 6, 6]\n",
    "line[AttributeType.from_status] = [1, 1, 1]\n",
    "line[AttributeType.to_status] = [1, 1, 1]\n",
    "line[AttributeType.r1] = [0.25, 0.25, 0.25]\n",
    "line[AttributeType.x1] = [0.2, 0.2, 0.2]\n",
    "line[AttributeType.c1] = [10e-6, 10e-6, 10e-6]\n",
    "line[AttributeType.tan1] = [0.0, 0.0, 0.0]\n",
    "line[AttributeType.i_n] = [1000, 1000, 1000]\n",
    "\n",
    "# load\n",
    "sym_load = initialize_array(DatasetType.input, ComponentType.sym_load, 2)\n",
    "sym_load[AttributeType.id] = [4, 7]\n",
    "sym_load[AttributeType.node] = [2, 6]\n",
    "sym_load[AttributeType.status] = [1, 1]\n",
    "sym_load[AttributeType.type] = [LoadGenType.const_power, LoadGenType.const_power]\n",
    "sym_load[AttributeType.p_specified] = [20e6, 10e6]\n",
    "sym_load[AttributeType.q_specified] = [5e6, 2e6]\n",
    "\n",
    "# source\n",
    "source = initialize_array(DatasetType.input, ComponentType.source, 1)\n",
    "source[AttributeType.id] = [10]\n",
    "source[AttributeType.node] = [1]\n",
    "source[AttributeType.status] = [1]\n",
    "source[AttributeType.u_ref] = [1.0]\n",
    "\n",
    "# all\n",
    "input_data = {\n",
    "    ComponentType.node: node,\n",
    "    ComponentType.line: line,\n",
    "    ComponentType.sym_load: sym_load,\n",
    "    ComponentType.source: source,\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73dba42b",
   "metadata": {},
   "source": [
    "### Export to JSON\n",
    "\n",
    "We can use the fuction `json_serialize_to_file` to export the input data to a json file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "724e098a",
   "metadata": {},
   "outputs": [],
   "source": [
    "import tempfile\n",
    "from pathlib import Path\n",
    "\n",
    "from power_grid_model.utils import json_serialize_to_file\n",
    "\n",
    "temp_path = Path(tempfile.gettempdir())\n",
    "json_serialize_to_file(temp_path / \"input.json\", input_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "071c790a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"version\": \"1.0\",\n",
      "  \"type\": \"input\",\n",
      "  \"is_batch\": false,\n",
      "  \"attributes\": {},\n",
      "  \"data\": {\n",
      "    \"node\": [\n",
      "      {\"id\": 1, \"u_rated\": 10500},\n",
      "      {\"id\": 2, \"u_rated\": 10500},\n",
      "      {\"id\": 6, \"u_rated\": 10500}\n",
      "    ],\n",
      "    \"line\": [\n",
      "      {\"id\": 3, \"from_node\": 1, \"to_node\": 2, \"from_status\": 1, \"to_status\": 1, \"r1\": 0.25, \"x1\": 0.20000000000000001, \"c1\": 1.0000000000000001e-05, \"tan1\": 0, \"i_n\": 1000},\n",
      "      {\"id\": 5, \"from_node\": 2, \"to_node\": 6, \"from_status\": 1, \"to_status\": 1, \"r1\": 0.25, \"x1\": 0.20000000000000001, \"c1\": 1.0000000000000001e-05, \"tan1\": 0, \"i_n\": 1000},\n",
      "      {\"id\": 8, \"from_node\": 1, \"to_node\": 6, \"from_status\": 1, \"to_status\": 1, \"r1\": 0.25, \"x1\": 0.20000000000000001, \"c1\": 1.0000000000000001e-05, \"tan1\": 0, \"i_n\": 1000}\n",
      "    ],\n",
      "    \"sym_load\": [\n",
      "      {\"id\": 4, \"node\": 2, \"status\": 1, \"type\": 0, \"p_specified\": 20000000, \"q_specified\": 5000000},\n",
      "      {\"id\": 7, \"node\": 6, \"status\": 1, \"type\": 0, \"p_specified\": 10000000, \"q_specified\": 2000000}\n",
      "    ],\n",
      "    \"source\": [\n",
      "      {\"id\": 10, \"node\": 1, \"status\": 1, \"u_ref\": 1}\n",
      "    ]\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "# we can display the json file\n",
    "with (temp_path / \"input.json\").open(\"r\") as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6e85c767",
   "metadata": {},
   "source": [
    "### Import JSON\n",
    "\n",
    "We can use the fuction `json_deserialize_from_file` to import the input data from a json file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "c79d7216",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "   id  energized      u_pu             u   u_angle             p             q\n",
      "0   1          1  0.998988  10489.375043 -0.003039  3.121451e+07  6.991358e+06\n",
      "1   2          1  0.952126   9997.325181 -0.026031 -2.000000e+07 -5.000000e+06\n",
      "2   6          1  0.962096  10102.012975 -0.021895 -1.000000e+07 -2.000000e+06\n"
     ]
    }
   ],
   "source": [
    "# round trip and run power flow\n",
    "\n",
    "from power_grid_model.utils import json_deserialize_from_file\n",
    "\n",
    "imported_data = json_deserialize_from_file(temp_path / \"input.json\")\n",
    "\n",
    "pgm = PowerGridModel(imported_data)\n",
    "result = pgm.calculate_power_flow()\n",
    "\n",
    "print(pd.DataFrame(result[ComponentType.node]))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f2409710",
   "metadata": {},
   "source": [
    "## Import and Export Batch Update/Result Dataset\n",
    "\n",
    "You can use the same function to import and export batch update/result dataset for batch calculation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "37cd5ade",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create batch set\n",
    "\n",
    "load_profile = initialize_array(DatasetType.update, ComponentType.sym_load, (3, 2))\n",
    "load_profile[AttributeType.id] = [[4, 7]]\n",
    "# this is a scale of load from 0% to 100%\n",
    "load_profile[AttributeType.p_specified] = [[30e6, 15e6]] * np.linspace(0, 1, 3).reshape(-1, 1)\n",
    "\n",
    "\n",
    "time_series_mutation = {ComponentType.sym_load: load_profile}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "89011a10",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"version\": \"1.0\",\n",
      "  \"type\": \"update\",\n",
      "  \"is_batch\": true,\n",
      "  \"attributes\": {},\n",
      "  \"data\": [\n",
      "    {\n",
      "      \"sym_load\": [\n",
      "        {\"id\": 4, \"p_specified\": 0},\n",
      "        {\"id\": 7, \"p_specified\": 0}\n",
      "      ]\n",
      "    },\n",
      "    {\n",
      "      \"sym_load\": [\n",
      "        {\"id\": 4, \"p_specified\": 15000000},\n",
      "        {\"id\": 7, \"p_specified\": 7500000}\n",
      "      ]\n",
      "    },\n",
      "    {\n",
      "      \"sym_load\": [\n",
      "        {\"id\": 4, \"p_specified\": 30000000},\n",
      "        {\"id\": 7, \"p_specified\": 15000000}\n",
      "      ]\n",
      "    }\n",
      "  ]\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "# export and print\n",
    "\n",
    "json_serialize_to_file(temp_path / \"update_batch.json\", time_series_mutation)\n",
    "\n",
    "with (temp_path / \"update_batch.json\").open(\"r\") as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "8c4ad876",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[       0.        0.]\n",
      " [15000000.  7500000.]\n",
      " [30000000. 15000000.]]\n"
     ]
    }
   ],
   "source": [
    "# import round trip, calculate\n",
    "\n",
    "imported_batch_update = json_deserialize_from_file(temp_path / \"update_batch.json\")\n",
    "\n",
    "batch_result = pgm.calculate_power_flow(update_data=imported_batch_update)\n",
    "\n",
    "print(batch_result[ComponentType.sym_load][AttributeType.p])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3c018413",
   "metadata": {},
   "source": [
    "## Detailed configuration with the params.json\n",
    "\n",
    "### Validation cases with exceptions\n",
    "\n",
    "In certain cases, you may want to create test cases that either \"should raise an exception\" and/or \"currently have a behavior that differs from the intended one\". Examples are:\n",
    "\n",
    "* To create a test case for a known [explicitly forbidden input](../advanced_documentation/terminology.md#bad-input), such as an unobservable grid.\n",
    "  * In this case, the validation case _should raise an exception_.\n",
    "* To create a repro case for a bug.\n",
    "  * In this case, the validation case _currently has behavior that differs from the intended behavior_.\n",
    "* To create a test for behavior that is not implemented yet, like a new calculation method.\n",
    "  * In this case, the validation case _currently has behavior that differs from the intended behavior_.\n",
    "* To create a behavioral test for bad input when applying BDD (behavior driven development) practices.\n",
    "  * In this case, the validation case both _should raise an exception_ and _currently has behavior that differs from the intended behavior_.\n",
    "\n",
    "To support the two use cases, two additional keywords are exposed: `raises` (to denote that raising is intended behavior) and `xfail` (to denote that the current behavior differs from the intended behavior). The values are dicts that contain a `raises` phrase and a `reason` to denote the exact intended/expected exception type, as well as a human-readable explanation about why that exception is intended/expected. Only exceptions known to the PGM tests can be added to the configuration. Note that the list of known exceptions is non-exhaustive, so it may be necessary to add a new exception type to the list of known exceptions.\n",
    "\n",
    "In addition to the PGM exception types and some Python built-in exception types, there are two types worth explicitly mentioning:\n",
    "\n",
    "* `AssertionError` can be used to denote tests in which the actual values are different from the expected values.\n",
    "* `Failed` can be used to denote tests that should raise (`raises`) but actually do not raise any exceptions at all (`xfail`).\n",
    "  * **NOTE:** If `_pytest.outcomes.Failed` is not found, there is a fall-back to the default [`pytest.mark.xfail`](https://docs.pytest.org/en/stable/reference/reference.html#pytest-mark-xfail) implementation. That means, that a test is marked `xfail` if it throws any exception that is not the intended exception, not `fail`. This is a known limitation of the current implementation.\n",
    "\n",
    "The following example shows how a test can be created intended to test future behavior.\n",
    "\n",
    "```json\n",
    "{\n",
    "  \"calculation_method\": \"iterative_linear\",\n",
    "  \"rtol\": 1e-8,\n",
    "  \"atol\": {\n",
    "    \"default\": 1e-8,\n",
    "    \".+_residual\": 1e-4\n",
    "  },\n",
    "  \"raises\": {\n",
    "    \"raises\": \"NotObservableError\",\n",
    "    \"reason\": \"The grid does not contain a voltage sensor\"\n",
    "  },\n",
    "  \"xfail\": {\n",
    "    \"raises\": \"SparseMatrixError\",\n",
    "    \"reason\": \"A sufficient observability check for meshed grids is not yet implemented. See also https://github.com/PowerGridModel/power-grid-model/issues/864.\"\n",
    "  }\n",
    "}\n",
    "```\n",
    "\n",
    "By default, the Power Grid Model test configuration accepts `XFAIL` cases - because they represent known issues - but rejects `XPASS` cases, i.e., the actual behavior was as intended, even though we expected it to be different. `XPASS` cases can happen because a known issue was fixed, or it can be the result of a newly introduced bug. It is up to the developer to resolve the conflict. Providing a good reason when marking something `xfail` can help with this decision. See https://docs.pytest.org/en/stable/reference/reference.html#pytest-mark-xfail for details.\n",
    "\n",
    "The following table shows how the different configurations can be used. Exception types `AError`, `BError` and `CError` denote exceptions known to the PGM tests.\n",
    "\n",
    "| Intended behavior | Expected behavior | Keywords                                                          |              Actual behavior: pass              |           Actually raises `AError`            |                                                         Actually raises `BError`                                                          |                                                         actually raises `CError`                                                          |\n",
    "| ----------------- | ----------------- | ----------------------------------------------------------------- | :---------------------------------------------: | :-------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------: |\n",
    "| pass              | pass              | `{}`                                                              | <span style=\"color : green\">**.** (PASS)</span> |  <span style=\"color : red\">F (AError)</span>  |                                                <span style=\"color : red\">F (BError)</span>                                                |                                                <span style=\"color : red\">F (CError)</span>                                                |\n",
    "| pass              | raises `AError`   | `{\"xfail\": {\"raises\": \"AError\"}}`                                 |   <span style=\"color : red\">F (XPASS)</span>    | <span style=\"color : orange\">x (XFAIL)</span> |                                                   <span style=\"color : red\">F (BError)                                                    |                                                <span style=\"color : red\">F (CError)</span>                                                |\n",
    "| raises `AError`   | pass              | `{\"raises\": {\"raises\": \"AError\"}, \"xfail\": {\"raises\": \"Failed\"}}` | <span style=\"color : orange\">x (XFAIL) </span>  |  <span style=\"color : red\">F (XPASS)</span>   | <span style=\"color : red\">F (BError)</span> if `_pytest.outcomes.Failed` is available, else <span style=\"color : orange\">x (XFAIL)</span> | <span style=\"color : red\">F (CError)</span> if `_pytest.outcomes.Failed` is available, else <span style=\"color : orange\">x (XFAIL)</span> |\n",
    "| raises `AError`   | raises `AError`   | `{\"raises\": {\"raises\": \"AError\"}}`                                |   <span style=\"color : red\">F (Failed)</span>   |  <span style=\"color : green\">. (PASS)</span>  |                                                   <span style=\"color : red\">F (BError)                                                    |                                                   <span style=\"color : red\">F (CError)                                                    |\n",
    "| raises `AError`   | raises `BError`   | `{\"raises\": {\"raises\": \"AError\"}, \"xfail\": {\"raises\": \"BError\"}}` |   <span style=\"color : red\">F (Failed)</span>   |  <span style=\"color : red\">F (XPASS)</span>   |                                                  <span style=\"color : orange\">x (XFAIL)                                                   |                                                <span style=\"color : red\">F (CError)</span>                                                |\n",
    "\n",
    "### Calculation method-specific configuration\n",
    "\n",
    "In rare cases, for instance when creating a new calculation method, calculation method-specific configuration may be necessary. This can be done via the `extra_params` keyword, which, for each overloaded calculation method, contains objects similar to the root `params.json`, but that is applied as a patch for that specific calculation method run. The following example illustrates that.\n",
    "\n",
    "```json\n",
    "{\n",
    "  \"calculation_method\": [\n",
    "    \"iterative_linear\",\n",
    "    \"newton_raphson\"\n",
    "  ],\n",
    "  \"rtol\": 1e-8,\n",
    "  \"atol\": {\n",
    "    \"default\": 1e-8,\n",
    "    \".+_residual\": 5e-4\n",
    "  },\n",
    "  \"extra_params\": {\n",
    "    \"newton_raphson\": {\n",
    "      \"experimental_features\": \"enabled\",\n",
    "      \"xfail\": {\n",
    "        \"raises\": \"SparseMatrixError\",\n",
    "        \"reason\": \"Current sensors are not yet implemented for this calculation method\"\n",
    "      }\n",
    "    }\n",
    "  }\n",
    "}\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "power-grid-model",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.14.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}